Kourier Integrator Online Help

Best Practices for Building a Data Warehouse using Kourier

Refer to the following best practices when you are using Kourier to build a data warehouse.

Database Object Names

Choose a format for how database objects will be named. The two most common approaches are camel case and snake case. Some studies have shown that snake case is a little bit easier to read but in the end it comes down to a personnel preference. Whatever you choose, be consistent. The following table shows examples of camel case and snake case.

Camel Case

Snake Case

HomeTerritory

home_territory

Bill2Address1

bill2_address1

Common Abbreviations

Establish a common set of abbreviations which matches the parlance of our application and industry. The goal is to be descriptive but to avoid the need for typing out long words. Try to keep the list as small as possible to avoid having a cheat sheet to decipher the abbreviations. The following table contains examples of common abbreviations.

Term/Word

Abbreviation

Manufacturer

Mfg

Amount

Amt

Fields Names with Common Meanings Across Files

If a field has a common meaning, the field name should end in a consistent way for each export. This is common for address lines. So if you have a billto_address1 field, you should NOT have a soldto_addr1 field. It should be soldto_address1.

Qualify Field Names only when Necessary.

Only qualify Field names when necessary. For example, in a customer file you might have a customer name/address fields. Avoid creating field names like customer_name, customer_address1, and so on since the table name qualifies the meaning of name. This results in fields like: name, address1, and so on.

Some fields in a SQL will be Foreign Key Field Names

Some fields in a SQL table will be foreign keys to other tables in the data warehouse. You should terminate those field names with something unique like ID and/or CD. This makes it easier for end users to deduce that a field links to another table.

Camel Case

Snake Case

ProductID

product_id

CustomerID

customer_id

StateCD

state_cd

Dates and Times

In U2 style database, a field will contain just a date or just a time. If a field contains both pieces of information, it is usually implemented as two separate fields or as a concatenated string (e.g. 16759*56091). In any case, choose a field name that describes what the field contains. For example, choose start_date if the field only contains a date and choose state_datetime if the field contains both pieces of information. Consider using the D conversion to transform an internal date into its external representation. We have found this works best when date formats can change by locals.

Numeric Fields

In U2 style database, an empty string is evaluated as zero (0). However, in SQL a null field is very different than a field that contains a value of zero. Use the S;*;’0’ conversion to force empty strings in U2 to zero. This makes doing arithmetic much easier in SQL queries.

Character Fields

Always use VARCHAR instead of CHAR. Be consistent in choosing field lengths for information that has the same meaning. For example, if an address line can be 30 characters, make sure all address lines are defined as 30 characters. This is especially important for primary keys/foreign keys to facilitate efficient joining across SQL tables.

Decompose Multi-part Strings/Keys

Always decompose fields that contain multi-part strings. This will make it much easier to do SQL joins across tables.

Decompose Overloaded Attributes into Separate Fields

If your application has an attribute where each value or sub-value means something different, break each value or sub-value into separate files.

Make Boolean Fields Consistent

It is quite common in a U2 database to have fields that represent yes/no or true/false that are implemented inconsistently. Sometimes yes is represented as a 1 while other times yes is represented as Y. Convert these types of fields into either Y or N. Y/N is preferable to 0/1 since you want your end-users to easily understand the values in the field.

Don’t Send Cross Reference Fields

If your application has cross reference fields in a primary field to aid reporting in a secondary file, do not export these fields. Instead use the power of SQL joins to present information from both tables.

Text/Comment Fields

It is quite common in a U2 database to implement a text field or comment field as a delimited field using system delimiters like @VM, @SVM or @TM. Use the @TEXT conversion to convert these fields into a string that can be inserted into a VARCHAR(nnnn) field in SQL. If the value of nnnn is less than 8,000 characters, the SSIS package will be able to use the fast-load capability of SQL instead of doing row-by-row updates. Consider using the UTRUNCATE conversion to flag truncated data in SQL.

Include Housekeeping Fields in each Export

Include an export_datetime and natural_key field in each export. The export_datetime field is useful if you plan on creating summary files. The natural_key is useful for debugging data quality issues. Oftentimes, this field is named something like eclipse_key or prelude_key. Note that it does not end with ID or CD since it is not intended to be used in join operations.

Aggregate Year-based Files

If you application uses separate files to denote a specific years’ worth of data (common for general ledger files), aggregate that data together in a single SQL table using the related files feature of Kourier. Having the date in a single table in SQL facilitates comparative reporting.

Aggregate Files Across Accounts

If your application uses separate accounts to segregate data by location, aggregate that data together in a single SQL table using the related files feature of Kourier. Having the data in a single table in SQL facilitates comparative reporting.

Create Standard Abbreviations for Surrogate Keys

Kourier will create surrogate keys when it normalizes the multi-valued or multi-subvalued information in a U2 file. Create a common abbreviation used across all tables when necessary. Common abbreviations are: vmc, svmc, line_no, subline_no. Note that in some child/grand-child situations, that these terms represent foreign keys to the other table so if we are following the rules, they should end in ID or CD. However, these types of fields are normally treated as the exception to the rule.

Using Virtual Fields

Only include virtual fields in an U2 to SQL export if the field can be calculated using only the information in the source file. This is necessary to ensure that Kourier’s net change works properly. However, if you can ensure that when a child record in a separate file is changed, it always causes a change in the parent record in a separate file. Then you can include both files as if they were the same file.

Allow Users to only Access Views for Reporting

Only allow end-users to access information in the data warehouse via SQL views. This allows you to hide the complexities of common calculations that you might want to include in the view and it provides a way to enforce that in all end-user queries including the SQL compiler hit of WITH(NOLOCK). Using the WITH(NOLOCK) directive avoids potential issues with two or more queries waiting for the first query to complete and avoids lock contention when Kourier SSIS packages run.

Related Topics: 

U2 to SQL Export

ODBC to SQL Export

Quick Start Workbench

Quick Start Schema Editing

Quick Start Configuration

Generate SSIS Package